Clustered Modulo Scheduling in a VLIW Architecture with Distributed Cache

نویسندگان

  • F. Jesús Sánchez
  • Antonio González
چکیده

Clustering is an approach that many microprocessors are adopting in recent times in order to mitigate the increasing penalties of wire delays. In this work we propose a novel clustered VLIW architecture which has all its resources partitioned among clusters, including the cache memory. A modulo scheduling scheme for this architecture is also proposed. This algorithm takes into account both register and memory inter-cluster communications so that the final schedule results in a cluster assignment that favors cluster locality in cache references and register accesses. It has been evaluated for both 2and 4-cluster configurations and for differing number and latencies of inter-cluster buses. The proposed algorithm produces schedules with very low communication requirements and outperforms previous cluster-oriented schedulers.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Effectiveness of Loop Unrolling for Modulo Scheduling in Clustered VLIW Architectures

Clustered organizations are becoming a common trend in the design of VLIW architectures. In this work we propose a novel modulo scheduling approach for such architectures. The proposed technique performs the cluster assignment and the instruction scheduling in a single pass, which is shown to be more effective than doing first the assignment and later the scheduling. We also show that loop unro...

متن کامل

Partitioned Schedules for Clustered VLIW Architectures

This paper presents results on a new approach to partitioning a modulo-scheduled loop for distributed execution on parallel clusters of functional units organized as a VLIW machine. A distinctive characteristic of this architecture is the use of register files organized by means of queues, which results in a number of advantages over conventional schemes, but also requires the development of sp...

متن کامل

Thesis - Vasileios Porpodas

Very Long Instruction Word (VLIW) processors are wide-issue statically scheduled processors. Instruction scheduling for these processors is performed by the compiler and is therefore a critical factor for its operation. Some VLIWs are clustered, a design that improves scalability to higher issue widths while improving energy efficiency and frequency. Their design is based on physically partitio...

متن کامل

Smart Memory Management through Locality Analysis

Cache memories were incorporated in microprocessors in the early times and represent the most common solution to deal with the gap between processor and memory speeds. However, many studies point out that the cache storage capacity is wasted many times, which means a direct impact in processor performance. Although a cache is designed to exploit different types of locality, all memory reference...

متن کامل

On the Effectiveness of the Scheduling Algorithm of the Dynamically Trace Scheduled VLIW Architecture

In a machine that follows the dynamically trace scheduled VLIW (DTSVLIW) architecture, VLIW instructions are built dynamically through a scheduling algorithm that can be implemented in hardware. These VLIW instructions are cached so that the machine can spend most of its time executing VLIW instructions without sacrificing any binary compatibility. This paper evaluates the effectiveness of the ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • J. Instruction-Level Parallelism

دوره 3  شماره 

صفحات  -

تاریخ انتشار 2001